SIB-BLAST: a web server for improved delineation of true and false positives in PSI-BLAST searches
نویسندگان
چکیده
A SIB-BLAST web server (http://sib-blast.osc.edu) has been established for investigators to use the SimpleIsBeautiful (SIB) algorithm for sequence-based homology detection. SIB was developed to overcome the model corruption frequently observed in the later iterations of PSI-BLAST searches. The algorithm compares resultant hits from the second iteration to the final iteration of a PSI-BLAST search, calculates the figure of merit for each 'overlapped' hit and re-ranks the hits according to their figure of merit. By validating hits generated from the last profile against hits from the first profile when the model is least corrupted, the true and false positives are better delineated, which in turn, improves the accuracy of iterative PSI-BLAST searches. Notably, this improvement to PSI-BLAST comes at minimal computational cost as SIB-BLAST utilizes existing results already produced in a PSI-BLAST search.
منابع مشابه
Strategies for the effective identification of remotely related sequences in multiple PSSM search approach.
Searches using position specific scoring matrices (PSSMs) have been commonly used in remote homology detection procedures such as PSI-BLAST and RPS-BLAST. A PSSM is generated typically using one of the sequences of a family as the reference sequence. In the case of PSI-BLAST searches the reference sequence is same as the query. Recently we have shown that searches against the database of multip...
متن کاملPSI-Search: iterative HOE-reduced profile SSEARCH searching
UNLABELLED Iterative similarity searches with PSI-BLAST position-specific score matrices (PSSMs) find many more homologs than single searches, but PSSMs can be contaminated when homologous alignments are extended into unrelated protein domains-homologous over-extension (HOE). PSI-Search combines an optimal Smith-Waterman local alignment sequence search, using SSEARCH, with the PSI-BLAST profile...
متن کاملCascade PSI-BLAST web server: a remote homology search tool for relating protein domains
Owing to high evolutionary divergence, it is not always possible to identify distantly related protein domains by sequence search techniques. Intermediate sequences possess sequence features of more than one protein and facilitate detection of remotely related proteins. We have demonstrated recently the employment of Cascade PSI-BLAST where we perform PSI-BLAST for many 'generations', initiatin...
متن کاملHidden Markov models for detecting remote protein homologies
MOTIVATION A new hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences is described and evaluated. The method begins with a single target sequence and iteratively builds a hidden Markov model (HMM) from the sequence and homologs found using the HMM for database search. SAM-T98 is also used to construct model libraries automatically from sequences in structural da...
متن کاملImproving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements.
PSI-BLAST is an iterative program to search a database for proteins with distant similarity to a query sequence. We investigated over a dozen modifications to the methods used in PSI-BLAST, with the goal of improving accuracy in finding true positive matches. To evaluate performance we used a set of 103 queries for which the true positives in yeast had been annotated by human experts, and a pop...
متن کامل